Abstract. Exercise, the exercise generator of the SOURCe project, complements the 
project's existing tools: the search engine for the Searchable Online French-Greek 
parallel corpus for the UniveRsity of Cyprus (SOURCe) (Kakoyianni-Doa & Tziafa, 
2013), PENCIL (an alignment tool) (Kakoyianni-Doa, Antaris, & Tziafa, 2013), and 
the Synonyms and Library tools. These are designed as freely available resources for 
language processing that are easy for teachers and learners to use. The Exercise tool 
enables teachers to create online activities or printable paper-based worksheets on a 
variety of texts and topics, in a variety of exercise types (e.g. multiple choice; word, 
phrase or sentence matching; filling gaps with missing words/phrases; text 
reconstruction; listening). 

Keywords: exercise generator, computer-based language testing, teachers’ tool, 
language acquisition. 


1. Introduction 

The SOURCe project aims to provide a complete platform of tools to assist French 
language teachers in the classroom. In this paper we present its latest tool, 
Exercise, an exercise generator. It is the fifth tool of the SOURCe project, which 
consists of: 

• The Source 4 Corpus tool, which is a search engine for the searchable online 
French-Greek parallel corpus for the University of Cyprus (Kakoyianni-Doa 
& Tziafa, 2013). 

• The PENCIL tool, an alignment tool for translators, teachers and learners, 
which enables the creation and customisation of web-based corpora, by 
uploading and aligning French and Greek texts (Kakoyianni-Doa et al., 
2013). 

• The Synonyms tool, which provides a search engine for synonyms, based 
on machine translation or dictionary searches. 

• The Library tool, which showcases a part of the SOURCe project corpus. 

• The Exercise tool (presented in the following sections), which enables 
teachers to create online activities and/or print out paper-based worksheets. 

Following Kakoyianni-Doa and Tziafa’s (2013) methodology, “all these tools 
and functionalities are designed as freely available resources for language 
processing, along with the data to be processed, in usable formats for both 
teachers and learners” (p. 2) (Figure 1). As regards the parallel corpora, “the 
translations included are based on human understanding of textual relations, 
which is not the case for machine translation (yet), despite the fact that students 
tend to rely more and more on it” (Kakoyianni-Doa & Tziafa, 2013, p. 3). The 
corpus covers more than six centuries, from the 15th to the 21st century. The 
texts under study are instances of different domain-specific 
registers, so that students may compare the results and the use of each word or 
phrase in different contexts. The corpus consists of a fiction and a non-fiction part 
of 720,282 aligned sentences. 


Figure 1. The SOURCe project’s home page 

[Screenshot: the home page, headed "Welcome to the Source Corpus!", shows the 
University of Cyprus logo, links to the source, pencil, library, synonyms and 
exercises tools, and the following text: "The Source Corpus is a French-Greek 
parallel corpus, created for the University of Cyprus in order to facilitate French 
language learning for Greek students. It includes the Search area of the Corpus, 
the Pencil Tool, the Library Tool, the Search area of the Synonyms and the 
Exercises Tool."] 

In the following sections we focus on the construction and composition of the 
Exercise tool, based on a parallel corpus, “the content, annotation, encoding 
and availability of which are meant to serve the needs of teachers and students 
of French or Greek as a foreign language and also to facilitate future linguistic 
research” (Kakoyianni-Doa & Tziafa, 2013, p. 1). Moreover, we outline its future 
perspectives and applications. 


2. Method 

2.1. The Source Corpus as a basis 

Following Kakoyianni-Doa and Tziafa’s (2013) methodology, the core of the 
project is a collection of parallel corpora, either already existing or created by the 
participating researchers, made up of original and translated texts in French and 
Greek, aligned at sentence level. The corpus development is an ongoing process, with 
new texts constantly added. According to previous research, “the use of corpora in 
the classroom can have remarkable results as regards foreign language learning” 
(Kakoyianni-Doa & Tziafa, 2013, p. 2; see also Hadley, 2002). Moreover, as 
Kilgarriff (2009) suggests, parallel corpora can more easily be disguised as 
dictionaries and brought into the classroom. The corpus consists of different registers 
(Biber, 1993), in order to facilitate comparison of the results and to study the use 
of each word or phrase in a different context (e.g. literary, scientific, official, 
technical and journalistic language). Commonly used parallel corpora, mainly 
Europarl (Koehn, 2005) and the JRC-Acquis corpus (Steinberger et al., 2006) 
from the OPUS open parallel corpus (Tiedemann, 2012), were also used, as were 
literary works available from Project Gutenberg, or scanned, mainly from the local 
historical library Severeios. 

2.2. The Exercise tool construction 

In order to support the use of corpus linguistic tools by teachers and learners with 
no previous expertise, we designed a simple interface through which the user may 
search existing corpora, upload texts, and work online with interactive exercises. 
The Exercise tool enables teachers to create online activities or printable 
paper-based worksheets on a variety of texts and topics, in a variety of exercise 
types (e.g. multiple choice; word, phrase or sentence matching; filling gaps with 
missing words/phrases; text reconstruction; listening). 
Throughout the SOURCe project, and especially in the Exercise tool, we used Java 
(Enterprise Edition) for the backend; HTML5, CSS3 and JavaScript for the 
frontend; and Apache Solr for indexing, searching and fetching sentences based on 
a given similarity distance (used by the automated text selection feature of the 
multiple choice exercise, presented in the next section). 
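
To illustrate what fetching sentences by similarity distance can look like, the 
following is a minimal sketch in Python. It is not the project's code: the tool 
itself runs on Java and Apache Solr, and the paper does not specify which Solr 
distance metric is used. Here the standard-library difflib.SequenceMatcher stands 
in for that metric, and the function names, the sample sentences and the parameter 
k are hypothetical. 

```python
from difflib import SequenceMatcher
from typing import List


def distance(a: str, b: str) -> float:
    """Stand-in distance in [0, 1]; 0.0 means the strings are identical."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_options(correct: str, pool: List[str], k: int) -> List[str]:
    """Return the correct sentence followed by the k pool sentences closest to it."""
    # Exclude the correct answer itself so it is not offered twice.
    candidates = [s for s in pool if s != correct]
    candidates.sort(key=lambda s: distance(correct, s))
    return [correct] + candidates[:k]


# Hypothetical sample data, not taken from the SOURCe corpus.
correct = "Le chat dort sur le canapé."
pool = [
    "Le chat dort sur le canapé.",
    "Le chat dort sur le lit.",
    "Le chien dort sur le canapé.",
    "Il pleut beaucoup aujourd'hui.",
]

options = suggest_options(correct, pool, k=2)
for option in options:
    print(option)
```

Choosing near neighbours as the alternative answers keeps the wrong options 
plausible; a larger k gives more options per sentence. 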


3. The Exercise tool 


The main page of the tool (Figure 2) shows a table listing the existing exercises 
and four buttons, which either create a new exercise (if the user is a teacher or 
administrator) or filter the listed exercises by exercise type (if the user is a 
student). 


Figure 2. The exercises page (from the teacher homepage) 

[Screenshot: the exercises page, with exercise-type buttons (including "Fill the 
Gap" and "Mix & Match") above the table "Available Exercises":] 

ID   Type               Title                                                                   Created By 
67   fillTheGap         Completez les mots manquants:                                           Tziafa Eleni 
68   jumbleWordPuzzle   Put the words in the correct order to form a sentence.                  Tziafa Eleni 
71   mixAndMatch        The following sentences are mixed up, put them in the correct order!    Tziafa Eleni 
72   multipleChoice     Choose one or more correct translations.                                Tziafa Eleni 


All the exercises share a common construction interface that offers (1) fields for 
the title of the exercise (in Greek and French), (2) a description, visible only 
to the creator of the exercise, for taking notes about it, and (3) an 
automated text selection feature, which fetches text from the parallel 
corpora collection based on the user's selection. The automated text selection 
interface differs only for the multiple choice exercise, where the 
teacher can also select the number of suggested translations returned for each 
sentence, based on a distance metric provided by Apache Solr. The user can edit 
the automatically fetched text of the exercise or enter the text manually through a 
simple interface. Any exercise can be combined with an audio file to produce a 
listening exercise: the user selects an MP3, OGG or WAV audio file and the number 
of allowed replays. The exercises can be solved online or handed out as a 
printout (even when an audio file is attached, in which case the printout 
accompanies the audio played in the classroom). Texts, exercise types and audio 
can be combined freely. The teacher can edit the French part, the Greek part or 
both, thus providing scalable difficulty levels. 
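
As an illustration of how a gap-filling exercise might be derived from one aligned 
sentence pair, consider the following Python sketch. It is an assumption-laden 
example rather than the tool's implementation: the function name make_gap_fill, 
the rule of blanking the longest words, and the sample sentence pair are all 
hypothetical. 

```python
import re
from typing import List, Tuple


def make_gap_fill(sentence: str, n_gaps: int) -> Tuple[str, List[str]]:
    """Blank out n_gaps words; return the gapped text and the expected answers."""
    words = list(re.finditer(r"\w+", sentence))
    # Hypothetical selection rule: longest words first, earlier position breaks ties.
    chosen = sorted(words, key=lambda m: (-len(m.group()), m.start()))[:n_gaps]
    chosen.sort(key=lambda m: m.start())  # restore reading order

    parts, answers, cursor = [], [], 0
    for match in chosen:
        parts.append(sentence[cursor:match.start()])
        parts.append("_____")
        answers.append(match.group())
        cursor = match.end()
    parts.append(sentence[cursor:])
    return "".join(parts), answers


# Hypothetical aligned pair; the Greek side is shown to the student as a hint.
french = "Le professeur prépare un exercice pour ses étudiants."
greek = "Ο καθηγητής ετοιμάζει μια άσκηση για τους φοιτητές του."

gapped, answers = make_gap_fill(french, 2)
print(greek)
print(gapped)   # Le _____ prépare un exercice pour ses _____.
print(answers)  # ['professeur', 'étudiants']
```

Raising n_gaps, or blanking words on the Greek side instead, is one way the 
difficulty could be scaled. 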

When students are solving the exercise, they are able to check the results or have 
the solution displayed (Figure 3). 


Figure 3. Solution display when the student solves a jumble words puzzle 
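
The result check for a jumbled-words puzzle can be sketched as follows. This is a 
minimal Python illustration under stated assumptions, not the tool's code: the 
functions jumble and check, the whitespace-based word splitting and the sample 
sentence are hypothetical. 

```python
import random
from typing import List


def jumble(sentence: str, seed: int) -> List[str]:
    """Return the words of the sentence in a shuffled order."""
    shuffled = sentence.split()
    # A shuffle can occasionally reproduce the original order;
    # a real tool would presumably re-shuffle in that case.
    random.Random(seed).shuffle(shuffled)
    return shuffled


def check(attempt: List[str], sentence: str) -> List[bool]:
    """Mark each position of the student's ordering as right (True) or wrong."""
    solution = sentence.split()
    return [i < len(attempt) and attempt[i] == word
            for i, word in enumerate(solution)]


sentence = "Je vais à l'école demain"  # hypothetical example
puzzle = jumble(sentence, seed=1)
print(puzzle)

# A student answer with the first two words swapped.
attempt = ["vais", "Je", "à", "l'école", "demain"]
print(check(attempt, sentence))  # [False, False, True, True, True]
```

Reporting correctness per position lets the interface highlight only the misplaced 
words before the full solution is displayed. 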



4. Conclusions 

In this paper we have presented the latest tool of the SOURCe project, the Exercise 
tool, which offers French language teachers a user-friendly interface for creating 
online or printable exercises, and offers students an online platform for practice. 
Plenty of online exercise generators are available, but they mostly target 
high-resourced languages such as English, work on monolingual texts, and are 
mostly not free. Very few proposals address other languages (e.g. Swedish: 
Volodina & Borin, 2012). 

Unlike other tools, the Exercise tool supports the low-resourced French and 
Greek languages in aligned texts. Moreover, it offers an online practice platform 
where students solve exercises and have their results checked automatically. 
Finally, it is freely available (registration required), and its source code and 
assets are released under a Creative Commons license. 

Our future plans include adding more types of texts and exercises, along 
with experimental evaluations and questionnaires, in order to assess users' 
attitudes toward the provided tools and to encourage teachers and students to use 
these resources in the classroom and beyond. 

Our objective is to develop a complete platform of tools that will help teachers 
discover, adapt and apply new tools in the classroom. It is well known that 
existing natural language processing tools and resources are rarely integrated 
into language learning, despite their potential as learning and research tools. To 
overcome this problem, the main goal of the project is to provide 
language instructors and learners with ready-made corpora and corpus-based 
exercises, available for use in a new learning environment. The platform provides 
innovative corpus-based learning activities and interactive exercises. This study 
could also serve as a pilot for the creation of multilingual resources in the 
form of parallel corpora. The project is thus intended not only to fill a gap in the 
literature on corpora used in the classroom, but also to make valuable resources 
available, especially for a low-resourced language such as Greek. 